WO 99/26415 PCT/IL98/00545
METHOD AND SYSTEM FOR PERSONALIZING IMAGES INSERTED INTO A VIDEO STREAM



FIELD OF THE INVENTION 

The present invention relates to systems and methods for inserting 
images into a video stream in general. 

BACKGROUND OF THE INVENTION 

Currently, video services mainly include broadcasting of standard TV 
programs over the air, through cable systems or via satellite. The latter is in the 
form of digital video. Other digital video services are mainly transmitted 
point-to-point via communication networks such as the telephone system or the 
Internet. 

There are two forms of TV advertising. In one, an advertising video clip is 
shown between portions of a TV show or static images are superimposed on a 
portion of a screen. In another, bulletin boards of advertisements are seen. The 
latter is common at sports events where the advertising boards ring the sports 
arena and are seen both by the spectators at the event and by the viewers seeing 
the event via TV. 

Because the sports events occur in one location, the advertising boards 
generally have advertisements thereon of the companies which market in that 
location. However, if the sports event is broadcast by TV, it can be viewed by
many other viewers, few of whom are in the locale of the sports event. As a
result, the advertisement may be ignored by the viewers. 

There exist systems which can insert images into the video stream to 
replace the advertisement on the advertising board. These systems use the 
chroma-key method to replace the portions of each frame of a video stream 
having a specific color or colors. 

One system, described in U.S. Patent 5,491,517 and owned by the 
common assignees of the present invention, inserts images onto specific portions 
of the playing field. The disclosure of U.S. Patent 5,491,517 is incorporated
herein by reference. 

The insertion of the images is typically performed at the local 
broadcasting station and, typically, the images to be inserted are advertisements
for local products. 

At present, digital video services exist mainly in the form of TV using 
direct broadcast from satellite (DBS), such as DirecTV in the US, or over the
Internet as a downloadable file or as a live stream (using products from Real
Networks of the USA or VDOnet of the USA). The video stream is mainly
uni-directional, from the server machine to the client machine. However, in some
cases, especially for videos downloaded from the Internet, the client has some 
control over the playback. For example, the MediaPlayer of the WINDOWS 
operating system from Microsoft Corporation of the USA, which can be used for 
playing video streams, has pause/stop, rewind and play controls. 

Advertising on the Internet is limited to messages displayed in fixed areas
of the page or text. The advertisement is typically in the form of a banner with a
fixed message; some sort of animation within the banner is also common.

SUMMARY OF THE INVENTION

An object of the present invention is to provide a method and system
which enables a video service provider to personalize the video stream provided
to each of its clients according to a priori individual knowledge of its clients.

The present invention generates a user profile from video sequences
selected by the user from a video server having a plurality of video sequences 
stored therein. The selected video sequence is personalized according to the user 
profile, for transmission to the user. 

In another embodiment, the video server broadcasts the video sequence,
accompanied by video parameters for placement of insertable images, to a
plurality of users, wherein each user has a matched image storage unit, from
which at least one image can be inserted into the video sequence transmitted by
the broadcaster.

In accordance with another preferred embodiment of the present
invention, the system and method also include an image server for generating
user profiles and for providing images to each image storage unit based on the
user profile.

Additionally, in accordance with a preferred embodiment of the present 
invention, the personalization includes associating group profiles with images to
be implanted, selecting at least one image from among the images to be 
implanted according to the group profile which most closely matches the user 
profile and implanting the selected at least one image into the video sequence. 

Moreover, in accordance with a preferred embodiment of the present
invention, all of the personalization can occur in a single processing unit. Alternatively,
the implanting can occur in one processing unit and the association and selection
can occur in a second processing unit.

Finally, the present invention can include receiving user feedback in
response to the at least one selected and implanted image and providing the user
feedback to the video server.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be understood and appreciated more fully from 
the following detailed description taken in conjunction with the appended drawings 
in which: 

Fig. 1 is a schematic illustration of a system for personalizing a video
sequence, constructed and operative in accordance with a preferred embodiment 
of the present invention, in conjunction with user computers which receive the 
personalized video sequence; 

Fig. 2 is a block diagram illustration of a personalization system forming 
part of the system of Fig. 1;

Fig. 3A is a schematic illustration of transformations, useful in 
understanding the operation of the personalization system of Fig. 2; 

Fig. 3B is a schematic illustration of a permission mask, useful in 
understanding the operation of the personalization system of Fig. 2; 

Fig. 4 is a block diagram illustration of a personalization module forming 
part of the personalization system of Fig. 2; 

Figs. 5A and 5B are schematic illustrations of mixing operations, useful in
understanding the operation of the personalization module of Fig. 4; and 

Figs. 6 and 7 are schematic illustrations of two alternative systems for 
personalizing a video sequence in conjunction with user computers which receive 
the personalized video sequence. 

DETAILED DESCRIPTION OF THE PRESENT INVENTION

The present invention is a system for personalizing video based on some 
knowledge (e.g. sex, age, hobbies, etc.) of the individual user requesting the 
video. The personalization can take many forms. It can be an advertisement for 
a company present only in the area where the user lives or works or for a 
company selling products of a type the user is known to like or for any other type 
of product or service which relates to the individual knowledge of the user. There 
can be multiple advertisements. For systems where the user can provide input, 
the personalization can change over time in response to some or all of the user's 
input. 

Reference is now made to Fig. 1 which generally illustrates the operation 
of the system of the present invention. 

The personalization system 10 operates on a video server 11 and 
communicates with a multiplicity of user computers or "clients" 12 (two are 
shown), typically via a network of some kind, such as a local area network and/or 
the Internet. The network communication is typically bi-directional, as described 
hereinbelow. The bi-directional communication can also be formed from a 
broadcast download to the user computers 12 and communication from the user 
computers 12 via the telephone system or the Internet. 

Each user requests whichever video sequence he desires to see. The 
requested video sequence is personalized with advertising images whose 
predefined profile the user fits. For example, company A might want to advertise 
to young men who like to read science fiction books. If the user fits this 
description, his video sequence will include the advertising image or images of 
company A. Company B might want to advertise to children who recently took an
algebra course. The video sequence requested by such a child will have 
company B's images implanted therein. 

For example and as shown on the monitors 28 of computers 12, the video 
might include the movement of a person 29 along a street 30 to a building 32. 
For a first user who is known to be a young person, the advertisement might be 
for a drink. Fig. 1 shows a drink bottle 34 on one wall 35 of a building along the 
street, in the monitor labeled 28A. For a user who is known to be a soccer fan, 
the advertisement might be for a sports company. Monitor 28B shows a soccer 
ball 36 on wall 35. 

In this example, both users view the same video but each receives a 
different advertisement, personalized by their user profile. It will be appreciated 
that the two users can view different video sequences but the implanted images 
that they will receive are a function of their profile. 

In order to display the personalized video, the user computer 12 typically 
includes a video unit 14. Such a unit is similar to the REALVIDEO video 
application manufactured by Real Networks with which a user communicates with 
a video server and which receives and displays a video stream. 

Each user's profile is typically created and updated based on his or her 
input. The input can be in answer to a questionnaire, it can be gathered from the 
user's responses to the advertising images previously implanted in his 
personalized video sequence, it can be based on the user's address on the 
network or any other fact about the user which the server 11 has the ability to
gather. 

The user computer 12 also includes a pointing device 16, such as a
mouse, a trackball, a touch screen, etc. with which the user can point to objects
on the screen. The operating system of the user computer 12 monitors the cursor
movements and selections of the user and provides the location to which the user
pointed to all of the applications on the computer, including the video unit 14.

Video unit 14 compares the location received from the operating system 
with the area or specific pixel locations of the implanted image, as provided from 
the video server 11. The data from the server 11 is described in more detail 
hereinbelow. 

If the user, once he views the personalized video, indicates the implanted
image using pointing device 16, video unit 14 can transmit an indication of this
fact, including the name associated with the object, to the video server 11. The
video server 11 typically responds to the user's request and the user identifier 20
(Fig. 2) uses this information to update the user's profile.

The video server 11 can also gather information regarding the responses
of its users to the various advertising images which the personalization system 10
implants. This information is valuable feedback to the companies about the 
quality of their advertisements and/or products. 

Reference is now made to Fig. 2 which illustrates the system of the
present invention, to Fig. 4 which details elements of the system of Fig. 2 and to
Figs. 3A, 3B and 5 which are useful in understanding the operation of the system 
of Fig. 2. 

The personalization system 10 comprises a user identifier 20, a user 
database 21, an object storage unit 22, a video controller 24, a video analyzer 25
and a plurality of video personalization modules 26, one per user currently
receiving a video stream. 

The user identifier 20 operates to identify the client or some aspect of the 
client. The user identifier 20 acquires the identifying information once the 
communication between the user and the server has been initiated. Typically, 
when the user accesses the video server, there is either some login procedure, as 
is common for video servers on the Internet which charge money for their 
services, or some handshaking between the client computer and the server which 
results in the server uniquely identifying the client computer. The login procedure 
can include questions which the user must answer from which the user identifier 
20 builds a profile of the user or it can simply ask the identification of the user. 
The former is common as part of setting up a subscription with the video server. 
The latter is common once the user is a subscriber to the service. 

After the user logs in, the user identifier 20 provides the user's profile to
the object storage unit 22 which, in turn, compares the user's profile to a set of 
profiles predefined by the advertisers for each of their advertising images or sets 
of images. The object storage unit 22 selects the closest stored profile. The 
profiles typically group users by any desired characteristic, or set of 
characteristics, of the user, such as residential area, family data, hobbies, sex, 
etc. or based on the user's previous requests. 

Fig. 2 shows the same example display as Fig. 1, where a young user 
receives a drink advertisement and a soccer fan receives a soccer advertisement. 

Another method for identifying the user utilizes the handshaking 
information, which uniquely identifies the computer 12 to the server. For example, 
when a computer 12 accesses the video server via the Internet, the computer 12
has to indicate the Internet Protocol (IP) address at which computer 12 is 
connected to the Internet. These IP addresses are regionally allocated and thus, 
user identifier 20 can determine the approximate geographic location of the 
computer 12 and can use this geographic location as the user profile. 
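
By way of illustration only, the address-based identification described above might be sketched as follows. The table REGION_BY_NETWORK and the function region_for_address are hypothetical names introduced for this sketch; the prefixes are documentation-only address ranges, not real regional allocations, and the specification does not prescribe any particular lookup method.

```python
import ipaddress

# Hypothetical allocation table: network prefix -> region label.
# The prefixes are reserved documentation ranges, not real allocations.
REGION_BY_NETWORK = {
    ipaddress.ip_network("192.0.2.0/24"): "region-north",
    ipaddress.ip_network("198.51.100.0/24"): "region-south",
}


def region_for_address(address, table=REGION_BY_NETWORK):
    """Return the region whose network contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for network, region in table.items():
        if ip in network:
            return region
    return None


# The returned region could then serve as the (approximate) user profile.
profile = {"region": region_for_address("192.0.2.17")}
```

An address outside every listed prefix yields None, in which case a system of this kind would presumably fall back on another source of profile information, such as the login procedure.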

The user identifier 20 provides the object storage unit 22 with the user 
profile and provides the user's video request to the video controller 24. 

The object storage unit 22 stores the various images to be inserted into 
the videos and organizes them according to the group profile defined for them. 
The personalized data can be a single image to be inserted multiple times and/or 
a set of images to be inserted within the video at different times. The 
personalized data includes a name for each image to be inserted as well as a 
schedule of when and for how long to implant each image. 

In response to a received user profile, object storage unit 22 determines 
the group profile which most closely matches the user profile and outputs the 
images associated with the matched group profile. If there is more than one set
of images associated with the matched group profile, the object storage unit 22
selects one of the sets of images. Furthermore, object storage unit 22 can update
the user profile to mark which set of images the user has already seen.
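
The selection performed by the object storage unit 22 can be sketched as follows. The specification does not define how closeness between profiles is measured; this sketch assumes, purely for illustration, that profiles are attribute dictionaries and that the closest group is the one with the most attributes matched by the user. All function and variable names are hypothetical.

```python
def match_score(user_profile, group_profile):
    """Count the group attributes that the user profile satisfies."""
    return sum(1 for key, value in group_profile.items()
               if user_profile.get(key) == value)


def closest_group(user_profile, group_profiles):
    """Return the name of the group profile with the highest match score."""
    return max(group_profiles,
               key=lambda name: match_score(user_profile, group_profiles[name]))


def select_image_set(user_profile, group_profiles, image_sets, seen):
    """Pick an image set for the matched group, preferring unseen sets.

    `seen` is a set of image-set names already shown to this user; it is
    updated in place, mirroring the profile update described above.
    """
    group = closest_group(user_profile, group_profiles)
    candidates = image_sets[group]
    for image_set in candidates:
        if image_set not in seen:
            seen.add(image_set)
            return image_set
    # Every set has been seen already: fall back to the first one.
    return candidates[0]


groups = {
    "company_a": {"age_group": "young", "hobby": "science fiction"},
    "company_b": {"age_group": "child", "course": "algebra"},
}
sets_by_group = {"company_a": ["a_set_1", "a_set_2"], "company_b": ["b_set_1"]}
user = {"age_group": "young", "hobby": "science fiction", "sex": "male"}
seen_sets = set()
first_choice = select_image_set(user, groups, sets_by_group, seen_sets)
```

Other similarity measures, such as weighted attributes, would fit the same structure by changing only match_score.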

Video controller 24 selects a video sequence for each user in response to 
his request. The video sequence can either be a stored or a real-time one. The 
video controller 24 also receives video parameters from video analyzer 25 
defining how to implant the images. As described hereinbelow, the video analyzer 
25 generates the parameters from analyzing each frame of the video sequence. 
This analysis is performed in real-time if the sequence is received in real-time;
otherwise, it is performed off-line.

Object storage unit 22 and video controller 24 both provide their output to 
the personalization module 26 associated with the user. Object storage unit 22 
outputs the personalized data, such as a set of advertisements, associated with 
the user's group and the names associated with each image to be implanted and 
video controller 24 provides the selected video and the associated video 
parameters describing how to transform the personalized data in order to implant 
the personalized data into the video stream. 

Thus, as shown in Fig. 2, each personalization module 26 receives a 
user's requested video, the personalized data to be implanted therein and the 
video parameters. Fig. 2 also shows the output of the two modules 26, assuming 
that both users requested the same video having the street scene. One user has 
the bottle 34 implanted therein and the other has the soccer ball 36 implanted 
therein. 

The images of the personalized data are designed "flat", with no 
perspective in them. One such image 37 is shown in Fig. 3A. However, the 
surfaces on which they are to be implanted, such as a surface 38, are viewed in 
perspective. 

The video analyzer 25 analyzes each frame of the video stream to 
determine the viewing angle of the camera towards the surface on which the 
image is to be implanted and which elements of the frame are foreground and 
which are background. The analysis can be performed in many ways, some of 
which are described in US 5,491,517 and in PCT Publications WO 97/12480, 
assigned to the common assignee of the present invention, the teachings of which
are incorporated herein by reference. 

According to U.S. Patent 5,491,517, video analyzer 25 has a flat model 
37 of the surface. Analyzer 25 finds the surface within each frame of the video 
stream and determines a transformation T, per frame, from the flat model 37 to 
the perspective view 38 in each frame. There can be many surfaces which can 
receive an implant and each implant is associated with only one surface. Fig. 3A 
shows two surfaces, two implants 39 to be implanted thereon and the resultant 
perspective implanted images 41. 
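
For illustration, the per-frame transformation T can be sketched under the assumption that it is representable as a 3x3 planar projective matrix acting on homogeneous coordinates. That representation is an assumption of this sketch and is not stated in the specification; the function names are likewise hypothetical.

```python
def apply_transform(T, point):
    """Map a point (x, y) of the flat model through the 3x3 matrix T."""
    x, y = point
    u = T[0][0] * x + T[0][1] * y + T[0][2]
    v = T[1][0] * x + T[1][1] * y + T[1][2]
    w = T[2][0] * x + T[2][1] * y + T[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under T")
    # Divide by the homogeneous coordinate to obtain frame coordinates.
    return (u / w, v / w)


def transform_corners(T, width, height):
    """Return the frame positions of the four corners of a flat image."""
    corners = [(0, 0), (width, 0), (width, height), (0, height)]
    return [apply_transform(T, corner) for corner in corners]


# A pure translation by (10, 20): the last row leaves w equal to 1.
T_shift = [[1, 0, 10], [0, 1, 20], [0, 0, 1]]
shifted = transform_corners(T_shift, 4, 2)

# A non-trivial last row introduces perspective foreshortening.
T_persp = [[1, 0, 0], [0, 1, 0], [0.5, 0, 1]]
foreshortened = apply_transform(T_persp, (2, 4))
```

Mapping only the corners gives the outline of the perspective image; warping the image content itself would apply the same mapping, or its inverse, per pixel.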

Video analyzer 25 also determines which elements of the frame are 
foreground elements and which are background elements. For example, in Fig. 2, 
the person 29 is a foreground element and the walls 35 and building 32 are 
background elements. Typically, the background / foreground information is
described by a "permission mask" which marks the pixels of the frame over which
the image can be implanted, where implantation is only allowed in the
background areas. Fig. 3B illustrates a permission mask 43 for the video frame 
shown in Fig. 2, where the hatching marks areas where implantation is allowed. 
The permission mask 43 indicates that the area of person 29 (Fig. 2) is the only 
area where implantation is not allowed. 

The transformations T, location information regarding the location of the 
surface within each frame, and permission mask form the video parameters which 
analyzer 25 produces.

The personalization modules 26 use each transformation T to transform, 
per frame, the flat images 39 of the personalized data into perspective images 41 
whose perspective matches that of the surface on which the images are to be
implanted. The personalization modules 26 then implant the perspective images 
41 into the background of the current frame, thereby producing a personalized 
frame which is transmitted to the user's computer 12. Typically, the 
personalization modules 26 also attach data about the area or pixels of the frame 
where the implanted perspective image is to be found and a name for the 
implanted perspective image. 

Fig. 4 details one video personalization module 26. It comprises a 
personalized data storage unit 38, an image adapter 40, a video personalization 
scheduler 42 and a mixer 44. 

The personalized data storage unit 38 receives the personalized data for 
the personalization module 26 and provides a selected image from the 
personalized data when indicated to do so by the scheduler 42. 

The scheduler 42 receives a predefined schedule, which is associated 
with the personalized data, of when and where to insert an image of the 
personalized data and for how long. The schedule is prepared in advance, 
typically according to advertising considerations. Some advertisers will pay for 
many minutes of insertion while others will pay for minimal amounts. The 
schedule defines when, and for how long, each image will be displayed. The 
scheduler 42 also indicates onto which section of the surface the personalized 
data is to be implanted. 

During operation, the scheduler 42 receives a timing signal by which it 
measures the passage of time, starting from the moment the personalization 
module 26 first receives the video stream. When so indicated by the schedule,
the scheduler 42 provides an image selection signal to the storage unit 38 which
furnishes the selected image to the image adapter 40. At the same time, the 
scheduler 42 provides a location signal to the image adapter 40 to indicate onto 
which section of the surface, if there is more than one, to implant the selected
image. 
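
The operation of the scheduler 42 can be sketched as a lookup of the elapsed time in a predefined schedule. The record layout ScheduleEntry and the function active_entry are hypothetical; the specification states only that the schedule defines when, where and for how long each image is implanted.

```python
from typing import NamedTuple


class ScheduleEntry(NamedTuple):
    start: float      # seconds since the module first received the stream
    duration: float   # seconds for which the image stays implanted
    image_name: str   # name of the image in the personalized data
    section: int      # index of the surface section to implant onto


def active_entry(schedule, elapsed):
    """Return the entry in force at time `elapsed`, or None if none is."""
    for entry in schedule:
        if entry.start <= elapsed < entry.start + entry.duration:
            return entry
    return None


schedule = [
    ScheduleEntry(start=0, duration=10, image_name="bottle", section=0),
    ScheduleEntry(start=30, duration=5, image_name="ball", section=1),
]
current = active_entry(schedule, 32)
```

In this sketch the returned entry carries both the image selection (image_name) and the location signal (section); when no entry is active, no image is implanted in the current frame.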

Image adapter 40 also receives the transformation and location 
parameters from the video controller 24 (Fig. 2). Using these parameters, image 
adapter 40 transforms the flat selected image into one with the perspective of the 
current frame, as discussed hereinabove with respect to Fig. 3A. Since the image 
is typically much smaller than a frame, image adapter 40 places the perspective 
image in the proper location within a blank frame. Figs. 5A and 5B show two 
adapted images 49A and 49B, with the two perspective images of Fig. 3A in their
respective locations within the frame. The hatched portions 50 of Figs. 5A and 5B 
remain blank; only the image portions 52 of Figs. 5A and 5B contain image data. 

Image adapter 40 can be any suitable image adapter which can perform 
the adaptation described hereinabove. Two such image adapters are described 
in U.S. 5,491,517 and WO 97/12480. The two applications also describe suitable 
mixers 44. 

Image adapter 40 also defines the space within which the adapted image 
is placed. This typically is a bounding box around the adapted selected image 
which describes where, in the frame, the adapted image is placed. 
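
A bounding box of this kind could, for example, be derived from the frame positions of the corners of the adapted image. The following sketch assumes that the corners are available as (x, y) pairs and rounds outward to whole pixels; both the input format and the function name are assumptions made for illustration.

```python
import math


def bounding_box(points):
    """Return (left, top, right, bottom) enclosing all points, in pixels."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # Round outward so that the box fully contains the adapted image.
    return (math.floor(min(xs)), math.floor(min(ys)),
            math.ceil(max(xs)), math.ceil(max(ys)))


corners = [(1.0, 2.0), (5.0, 1.0), (4.0, 6.0), (0.5, 5.0)]
box = bounding_box(corners)
```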

Mixer 44 mixes the adapted image produced by image adapter 40 with the
current frame of the video stream to create one frame (a "personalized frame") of
a personalized video stream. Mixer 44 typically uses the permission mask 43
(Fig. 3B) and the adapted images 49A and 49B (Figs. 5A and 5B) to control the
mixing process. Wherever the permission mask 43 indicates the presence of the 
background, implantation can occur. 

For implantation, the mixer 44 provides the personalized frame with data 
of the original frame only if the adapted images 49A and 49B are blank for that
pixel. Wherever the adapted images 49A and 49B have image data and the
permission mask 43 allows implantation, mixer 44 mixes the image data with the 
data of the original frame. Mixer 44 can just replace the original frame data with 
the image data or it can blend the two, or it can perform any desired other mixing 
operation. 
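
The mixing rule described above can be sketched per pixel as follows. This sketch assumes single-channel frames held as nested lists, a blank adapted pixel represented by None, and a permission mask of booleans; the blend weight alpha is a hypothetical parameter, with 1.0 corresponding to plain replacement.

```python
def mix_frame(frame, adapted, mask, alpha=1.0):
    """Return a personalized frame built from the original and the implant.

    A pixel keeps its original value where the adapted image is blank
    (None) or where the permission mask forbids implantation; elsewhere
    the implant is blended with the original using weight `alpha`.
    """
    out = []
    for frame_row, adapted_row, mask_row in zip(frame, adapted, mask):
        out_row = []
        for original, implant, allowed in zip(frame_row, adapted_row, mask_row):
            if implant is None or not allowed:
                out_row.append(original)
            else:
                out_row.append(alpha * implant + (1.0 - alpha) * original)
        out.append(out_row)
    return out


frame = [[10, 20], [30, 40]]
adapted = [[None, 200], [200, 200]]      # top-left pixel is blank
mask = [[True, True], [False, True]]     # bottom-left pixel is foreground
replaced = mix_frame(frame, adapted, mask)
blended = mix_frame(frame, adapted, mask, alpha=0.5)
```

In the example, the bottom-left pixel keeps its original value because the mask marks it as foreground, even though the adapted image has data there.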

Mixer 44 also transmits the name associated with the implanted image 
and some indication of its location in the frame. The indication is typically based 
on the bounding box of the adapted image, as determined by the image adapter 
40. However, mixer 44 determines if there are any foreground elements which 
are superimposed over the implanted image and how this affects the shape of the 
implanted image. 

The indication produced by the mixer is an outline of the area within which 
the implanted image sits or a listing of the pixels which include the implanted 
image. This information is transmitted together with the personalized frame. The 
indication can be used by the video unit 14 (Fig. 1) to determine whether or not 
the user has indicated the implanted image with his pointing device 16. 
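
The comparison performed by the video unit 14 can be sketched as a simple hit test. The dictionary layout of the indication (a "box" outline or a "pixels" listing, together with a "name") and the function names are assumptions of this sketch, as is the shape of the feedback message.

```python
def hit_implanted_image(pointer, indication):
    """Return True if the pointer location falls on the implanted image."""
    x, y = pointer
    if "pixels" in indication:
        # Listing of the pixels which include the implanted image.
        return (x, y) in indication["pixels"]
    # Otherwise an outline, given here as (left, top, right, bottom).
    left, top, right, bottom = indication["box"]
    return left <= x <= right and top <= y <= bottom


def feedback_for(pointer, indication):
    """Build the message sent to the video server, or None on a miss."""
    if hit_implanted_image(pointer, indication):
        return {"selected": indication["name"]}
    return None


by_box = {"name": "bottle", "box": (100, 50, 140, 120)}
by_pixels = {"name": "ball", "pixels": {(10, 10), (10, 11), (11, 10)}}
message = feedback_for((120, 60), by_box)
```

A pixel listing allows the test to respect foreground elements that partly cover the implanted image, whereas a bounding box is more compact to transmit.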

It will be appreciated that the present invention provides personalized 
video streams based on a priori information about a user. Shown here is a 
system with a video server, with a plurality of personalization modules, connected
to a plurality of clients via a data network. 

However, other combinations of servers, personalization modules and 
clients are also incorporated within the scope of the present invention. For 
example, as shown in Fig. 6 to which reference is now briefly made, the 
personalization module can reside on the client side of the network. The video 
server 11 has a personalization preparation system 60 which is similar to the 
personalization system 10 but it does not include any personalization modules 26. 
The latter reside in each of the user computers 12 and are referenced 62. 

In this embodiment, the video server 11 transmits the requested video 
sequences along with the video parameters and the personalized data to the user 
computer 12. The individual personalization modules 62 create the personalized 
videos therefrom, typically "on-the-fly". 

Alternatively, as shown in Fig. 7 to which reference is now briefly made, 
the video sequences can be broadcast, or multi-cast, irrespective of the users'
requests. In this embodiment, the video server 11 has the personalization 
preparation system 60 but only transmits one video sequence and its video 
parameters at a time to the network. Each user computer has a video 
personalization module 62, as in the previous embodiment; however, in this 
embodiment, the modules 62 store the personalized data for a significant period 
of time (in the personalized data storage unit 38 of Fig. 4). Periodically, the 
personalized data is updated, as indicated by the dashed lines of Fig. 7. 

The personalization is, once again, according to the individual 
preferences of the users. These preferences can be provided to the 
personalization preparation system 60 via any of the methods described
hereinabove. 

The personalization modules 62 can reside in the user computers 12 or, if 
the network is that of cable or satellite television, in a local "set-top" box which 
provides output to a user television.

It will be appreciated by persons skilled in the art that the present 
invention is not limited by what has been particularly shown and described
hereinabove. Rather, the scope of the invention is defined by the claims that follow:

CLAIMS

1. A video personalization system comprising:

a video server having a plurality of video sequences stored therein; 

means for receiving a selection of a video sequence from a user; 

means for generating a user profile from user information provided 
to said video server; 

a personalization system for personalizing said selected video 
sequence according to said user profile thereby creating a personalized, 
selected video sequence for transmission to said user. 

2. A system according to claim 1 and wherein said personalization 
system includes: 

means for associating group profiles with images to be implanted; 

means for selecting at least one image from among said images to 
be implanted according to the group profile which most closely matches 
said user profile; and 

means for implanting said selected at least one image into said 
video sequence. 

3. A system according to claim 2 and wherein said means for 
implanting are implemented in one processing unit and said means 
for associating and said means for selecting are implemented in a 
second processing unit. 

4. A system according to claim 1 and also comprising means for
receiving user feedback in response to said at least one selected 
and implanted image and for providing said user feedback to said 
video server. 

5. A video personalization system comprising: 

a video server having a multiplicity of video sequences stored 
therein; 

a broadcaster for broadcasting a video sequence to a plurality of 
clients, wherein each video sequence is accompanied by video 
parameters for placement of insertable images; 

per client: 

a local image storage unit for storing a plurality of user-matched, 
locally stored images; and 

insertion means for inserting at least one of said locally stored 
images into a video sequence transmitted by said broadcaster. 

6. A system according to claim 5 and also comprising an image server 
for generating user profiles and for providing images to each local 
image storage unit based on the user profile of the user associated 
with each client. 

7. A method for personalizing video sequences, the method comprising 
the steps of: 

having a plurality of video sequences stored in a video server; 

receiving a selection of a video sequence from a user;

generating a user profile from user information provided to said 
video server; 

personalizing said selected video sequence according to said user 
profile thereby creating a personalized, selected video sequence for 
transmission to said user. 

8. A method according to claim 7 and wherein said step of 
personalizing includes the steps of: 

associating group profiles with images to be implanted; 

selecting at least one image from among said images to be 
implanted according to the group profile which most closely matches 
said user profile; and 

implanting said selected at least one image into said video 
sequence. 

9. A method according to claim 8 and wherein said step of implanting 
occurs in a first processing unit and wherein said steps of 
associating and selecting occur in a second processing unit. 

10. A method according to claim 7 and also including the steps of 
receiving user feedback in response to said at least one selected 
and implanted image and providing said user feedback to said video 
server. 

11. A method for personalizing video sequences, the method comprising
the steps of: 

storing a multiplicity of video sequences in a video server;

broadcasting a video sequence to a plurality of clients, wherein 
each video sequence is accompanied by video parameters for 
placement of insertable images; 

per client: 

storing a plurality of user-matched, locally stored images; and 

inserting at least one of said locally stored images into a video 
sequence transmitted by said broadcaster.

12. A method according to claim 11 and also comprising the steps of
generating user profiles and providing images to each local image 
storage unit based on the user profile of the user associated with 
each client. 